Stochastic Context-Free Grammars and RNA Secondary Structure Prediction
Authors
Abstract
This thesis focuses on the prediction of RNA secondary structure using stochastic context-free grammars (SCFGs). The RNA secondary structure prediction problem consists of predicting a two-dimensional structure from a one-dimensional nucleotide sequence. The theory behind SCFGs is explained, and an overview of the research literature on methods for secondary structure prediction is given. Furthermore, an SCFG framework is developed in the Java programming language as part of the thesis. This framework is used to test the prediction performance of various grammar constructions on real-world RNA data. Some ideas for modelling pseudoknots are discussed and implemented.
Similar resources
Introduction to stochastic context free grammars.
Stochastic context free grammars are a formalism which plays a prominent role in RNA secondary structure analysis. This chapter provides the theoretical background on stochastic context free grammars. We recall the general definitions and study the basic properties, virtues, and shortcomings of stochastic context free grammars. We then introduce two ways in which they are used in RNA secondary ...
Maximizing Expected Base Pair Accuracy in RNA Secondary Structure Prediction by Joining Stochastic Context-Free Grammars Method
The identification of RNA secondary structures has been among the most exciting recent developments in biology and medical science. Prediction of RNA secondary structure is a fundamental problem in computational structural biology. For several decades, free energy minimization has been the most popular method for prediction from a single sequence. It is based on a set of empirical free energy c...
An evolutionary algorithm for stochastic context-free grammar design, with applications to RNA secondary structure prediction
Stochastic Context-Free Grammars (SCFGs) have been used widely in modelling RNA secondary structure. They were motivated by the use of Hidden Markov Models (HMMs) in protein modelling (Krogh et al., 1993). What was lacking in HMMs, though, was the ability to model long-range interactions, which are necessary to provide an effective model for RNA secondary structure. Thus, SCFGs, as generalisati...
RNA secondary structure prediction using stochastic context-free grammars and evolutionary history
MOTIVATION: Many computerized methods for RNA secondary structure prediction have been developed. Few of these methods, however, employ an evolutionary model, thus relevant information is often left out from the structure determination. This paper introduces a method which incorporates evolutionary history into RNA secondary structure prediction. The method reported here is based on stochastic c...
RNA secondary structure prediction and runtime optimization
1. Background RNA secondary structure Pseudoknots Non-coding RNA 2. CONTRAfold: Probabilistic RNA folding Overview of the algorithm Details of the algorithm Performance of CONTRAfold 3. Other RNA folding methods: Physics-based models and Stochastic Context Free Grammars Physics-based models Stochastic Context Free Grammars Advantages of CONTRAfold over these other approaches 4. How RNA folding ...